Selecting the number of components in principal component analysis using cross-validation approximations
Abstract
Cross-validation is a tried and tested approach to selecting the number of components in principal component analysis (PCA); its main drawback, however, is its computational cost. In a regression (or nonparametric regression) setting, criteria such as generalized cross-validation (GCV) provide convenient approximations to leave-one-out cross-validation. They are based on the relation between the prediction error and the residual sum of squares weighted by elements of a projection matrix (or a smoothing matrix). Such a relation is then established in PCA using an original presentation of PCA with a unique projection matrix. It enables the definition of two cross-validation approximation criteria: the smoothing approximation of the cross-validation criterion (SACV) and the GCV criterion. The method is assessed with simulations and gives promising results. Crown Copyright © 2011 Published by Elsevier B.V. All rights reserved.
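The regression-side relation the abstract refers to can be illustrated with a minimal sketch. It covers ordinary least squares only and is not the paper's PCA criteria (SACV or the PCA form of GCV), which are not given in this record. The function name `loo_and_gcv` is hypothetical. It relies on two standard identities: the leave-one-out residual equals e_i / (1 - h_ii), where h_ii is a diagonal element of the projection matrix H, and GCV replaces each h_ii with the average tr(H)/n.

```python
import numpy as np

def loo_and_gcv(X, y):
    """Return (brute-force LOO MSE, hat-matrix LOO MSE, GCV) for OLS.

    Illustrative only: assumes X has full column rank and no h_ii equals 1.
    """
    n = X.shape[0]
    # Projection ("hat") matrix H = X (X^T X)^{-1} X^T, via the pseudo-inverse.
    H = X @ np.linalg.pinv(X)
    resid = y - H @ y
    h = np.diag(H)

    # Shortcut: the leave-one-out residual is e_i / (1 - h_ii).
    loo_shortcut = np.mean((resid / (1.0 - h)) ** 2)

    # GCV: replace every h_ii with the average leverage tr(H) / n.
    gcv = np.mean(resid ** 2) / (1.0 - np.trace(H) / n) ** 2

    # Brute force: refit n times, each time leaving one observation out.
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
        errs.append((y[i] - X[i] @ beta) ** 2)
    loo_exact = np.mean(errs)

    return loo_exact, loo_shortcut, gcv
```

On any full-rank design matrix, the first two returned values should agree up to floating-point error, while the third is the cheaper approximation that needs only the trace of H. This is the kind of saving the abstract says is carried over to PCA.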
Similar resources
Application of Information Complexity in Principal Component Regression Modeling of the Venturi Meter Drift
In principal component regression there is a problem of selecting the number of principal components to be retained in the model. Those principal components corresponding to near-zero eigenvalues can ruin the precision of the regression coefficients estimator and therefore must be eliminated from the model. However, when the eigenspectrum gradually decays, it is difficult to decide how many pri...
Developing and Validation of Moral Behavior Styles Inventory
Article history: Received date: 13 September 2016. Review date: 2 October 2016. Accepted date: 20 November 2016. Printed on line: 5 January. Purpose: The present study was done to introduce an efficient tool in the field of moral behavior. Material & Method: The method of the study was correlational, its approach was test development, and its population was students of Islamic Azad University- Ast...
Feature selection using genetic algorithm for classification of schizophrenia using fMRI data
In this paper we propose a new method for classification of subjects into schizophrenia and control groups using functional magnetic resonance imaging (fMRI) data. In the preprocessing step, the number of fMRI time points is reduced using principal component analysis (PCA). Then, independent component analysis (ICA) is used for further data analysis. It estimates independent components (ICs) of...
Assessment of Cost Effectiveness of a Firm Using Multiple Cost Oriented DEA and Validation with MPSS based DEA
Data Envelopment Analysis (DEA) is a nonparametric tool for discriminating the best performers from a number of homogeneous Decision Making Units (DMUs). Cost oriented DEA models identify those best DMUs which run cost-efficient processes. This paper validates the outcome derived from the Ideal Frontier (mentioned in Sarkar. S (2014)) derived from non-central Principal Component Analysis and a slac...
Principal Component Analysis for Soil Conservation Tillage vs Conventional Tillage in Semi Arid Region of Punjab Province of Pakistan
Principal component analysis is a valid method used for data compression and information extraction in a given set of experiments. It is a well-known classical data analysis technique. There are a number of algorithms for solving the problems, some scaling better than others. Wheat ranks as the staple food of most of the nations as well as an agent of poverty reduction, food security and world ...
Journal title:
- Computational Statistics & Data Analysis
Volume 56, Issue
Pages -
Publication date: 2012